Extracting features from text to improve statistical machine translation
Similar resources
Text Segmentation Criteria for Statistical Machine Translation
For several reasons machine translation systems are today unsuited to process long texts in one shot. In particular, in statistical machine translation, heuristic search algorithms are employed whose level of approximation depends on the length of the input. Moreover, processing time can be a bottleneck with long sentences, whereas multiple text chunks can be quickly processed in parallel. Henc...
Optimizing Statistical Machine Translation for Text Simplification
Most recent sentence simplification systems use basic machine translation models to learn lexical and syntactic paraphrases from a manually simplified parallel corpus. These methods are limited by the quality and quantity of manually simplified corpora, which are expensive to build. In this paper, we conduct an in-depth adaptation of statistical machine translation to perform text simplification...
Linguistic Input Features Improve Neural Machine Translation
Neural machine translation has recently achieved impressive results, while using little in the way of external linguistic information. In this paper we show that the strong learning capability of neural MT models does not make linguistic features redundant; they can be easily incorporated to provide further improvements in performance. We generalize the embedding layer of the encoder in the att...
Discourse-level features for statistical machine translation
The talk will show how the disambiguation of discourse connectives can improve their automatic translation. Connectives are a class of frequent functional lexical items that play an important role in text readability and coherence. Longer-range context is taken into account to learn the signaled rhetorical relations. The labels obtained from a discourse connective classifier are then integrated...
11,001 New Features for Statistical Machine Translation
We use the Margin Infused Relaxed Algorithm of Crammer et al. to add a large number of new features to two machine translation systems: the Hiero hierarchical phrase-based translation system and our syntax-based translation system. On a large-scale Chinese-English translation task, we obtain statistically significant improvements of +1.5 BLEU and +1.1 BLEU, respectively. We analyze the impact of ...
Journal
Journal title: Journal of Applied Linguistics and Lexicography
Year: 2019
ISSN: 2687-0215
DOI: 10.33910/2687-0215-2019-1-1-12-17